Using Collocations to Assess MT Quality
Abstract
Conventional metrics for Machine Translation evaluation have focused on using n-gram similarity between a reference translation and a system translation as an indication of system quality. A simple n-gram model, however, cannot capture long-distance dependencies, and the requirement of a reference translation has prevented the use of these metrics at the decoding stage. In this paper we propose a set of collocation-based metrics to address these problems. A series of experiments shows that these metrics are capable of distinguishing between human translations and system translations, and can produce system rankings comparable to those of other metrics without requiring a reference translation.
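As a rough illustration of the reference-based n-gram similarity that the abstract contrasts against, the sketch below computes a clipped n-gram precision between a system translation and a reference. It is not the collocation-based metric proposed in the paper, which the abstract does not specify; the function names and example sentences are assumptions made for illustration only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_ngram_precision(hypothesis, reference, n):
    """Fraction of hypothesis n-grams that also occur in the reference.

    Each n-gram is credited at most as many times as it appears in the
    reference (clipping). Tokenization is plain whitespace splitting,
    which is an assumption of this sketch.
    """
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return matched / total


# Hypothetical example sentences (not taken from the paper).
reference = "the committee paid close attention to the report"
hypothesis = "the committee donated close attention to the report"

unigram_precision = clipped_ngram_precision(hypothesis, reference, 1)
bigram_precision = clipped_ngram_precision(hypothesis, reference, 2)
```

Note that the score cannot be computed without `reference`, which is the limitation the abstract points to: at decoding time no reference translation is available.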
Related papers
Collocations in a Rule-Based MT System: A Case Study Evaluation of Their Translation Adequacy
Collocations constitute a subclass of multi-word expressions that are particularly problematic for machine translation, due 1) to their omnipresence in texts, and 2) to their morpho-syntactic properties, allowing virtually unlimited variation and leading to long-distance dependencies. Since existing MT systems incorporate mostly local information, these are arguably ill-suited for handling thos...
Customizing Complex Lexical Entries for High-Quality MT
The customization of Machine Translation systems concentrates, for the most part, on MT dictionaries. In this paper, we focus on the customization of complex lexical entries that involve various types of lexical collocations, such as sub-categorization frames. We describe methods and tools that leverage existing parsers and other MT dictionaries for customization of MT dictionaries. This custom...
Towards the Automatic Acquisition of Lexical Selection Rules
This paper studies a certain type of collocation and its implications for, and application to, the acquisition of lexical selection rules in transfer-approach MT systems. Collocations reveal the co-occurrence possibilities of linguistic units in one language, which often require lexical selection rules to enhance the natural flow and clarity of MT output. The study presents an automatic acquisition and...
Collocational Processing in Two Languages: A psycholinguistic comparison of monolinguals and bilinguals
With the renewed interest in the field of second language learning for the knowledge of collocating words, research findings in favour of holistic processing of formulaic language could support the idea that these language units facilitate efficient language processing. This study investigated the difference between processing of a first language (L1) and a second language (L2) of congruent col...
Collocations, Dictionaries and MT
Collocations pose specific problems in translation (both human and machine translation). For the native speaker of English it may be obvious that you 'pay attention', but for a native speaker of Dutch it would have been much simpler if in English people 'donated attention.' Within an MT system, we can deal with these mismatches in different ways. Simply adding the entry to our bilingual diction...
Publication date: 2005